Randomized Support Vector Forest
Abstract
Based on the structural risk minimization principle, the linear SVM, which seeks the linear decision plane with the maximal margin in the input space, has gained increasing popularity due to its generalizability, efficiency, and acceptable performance. However, training data are rarely evenly distributed in the input space [1], which leads to a high global VC confidence [3] and degrades the performance of the linear SVM classifier. Partitioning the input space in tandem with local learning may alleviate the uneven data distribution problem. However, the extra model complexity introduced by partitioning frequently leads to overfitting. To solve this problem, we propose a new supervised learning algorithm, Randomized Support Vector Forest (RSVF): many partitions of the input space are constructed, with partitioning regions amenable to the corresponding linear SVMs. As illustrated in Figure 1, the RSVF consists of many Support Vector Trees (SVTs). Each SVT represents a scheme of data partition and the corresponding local classifier. The final classification result of the RSVF is pooled from all the SVTs. After comparing various pooling methods, including majority voting and max voting (i.e., taking the prediction from the SVT with the maximal confidence), we use majority voting over all of the trees in the forest for its simplicity and efficacy. We grow the RSVF through a procedure similar to growing the Classification And Regression Trees (CART) in a random forest [7]. The steps of building the RSVF are shown in Algorithm 1.
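The two pooling rules compared above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `majority_vote` and `max_vote`, the representation of each tree's output as a `(label, confidence)` pair, and the sample values are all hypothetical, and the confidence is assumed to be something like the absolute decision value of the local linear SVM that handled the sample.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted by the largest number of trees.

    predictions: list of (label, confidence) pairs, one per tree
    (a hypothetical format chosen for this sketch).
    Ties go to the label that was encountered first.
    """
    counts = Counter(label for label, _ in predictions)
    return counts.most_common(1)[0][0]

def max_vote(predictions):
    """Return the label from the single tree with the highest confidence."""
    label, _ = max(predictions, key=lambda pair: pair[1])
    return label

# Made-up outputs of three trees for one test sample:
# two weakly confident votes for +1, one strongly confident vote for -1.
preds = [(1, 0.4), (1, 0.3), (-1, 2.5)]

print(majority_vote(preds))  # label backed by the most trees
print(max_vote(preds))       # label backed by the most confident tree
```

On this made-up sample the two rules disagree: majority voting follows the two weak votes, while max voting follows the single confident tree. Max voting therefore depends on confidences being comparable across trees, which majority voting does not require.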